Ser. No. 10/087,145
Amendment in Response to Office Action of 7/27/04 

Amendments to the Claims 

The following list of claims will replace all prior versions of the claims in the application: 

1. (Currently amended) A method for training a kernel-based learning machine using a
dataset comprising: 

filling a kernel matrix with a plurality of kernels, each kernel comprising a 
pairwise similarity between a pair of data points within a plurality of data points in the dataset; 

defining a fully-connected graph comprising a plurality of nodes and a plurality of edges 
connecting at least a portion of the plurality of nodes with other nodes of the plurality, each edge 
of the plurality of edges having a weight equal to the kernel between a corresponding pair of data 
points, wherein the graph has an adjacency matrix that is equivalent to the kernel matrix; 

computing a plurality of eigenvalues for the kernel matrix; 

selecting an a first eigenvector corresponding to the smallest non-zero eigenvalue of the 
plurality of eigenvalues; 

bisecting the dataset into two classes using the selected first eigenvector;

aligning the kernels using a second eigenvector so that the two classes have equal 
probability; and 

training selecting an optimized kernel for use in the kernel-based learning machine using
at least a portion of the bisected dataset, wherein the optimized kernel produces maximal kernel
alignment. 
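
For illustration only, the matrix-filling, eigenvalue, and bisection steps recited in claim 1 might be sketched as follows. The Gaussian kernel, the parameters `sigma` and `tol`, and the function names are assumptions of this sketch and do not appear in the claims. The sketch also eigen-decomposes the graph Laplacian L = D - K built from the kernel matrix, a common spectral-graph reading of "smallest non-zero eigenvalue", rather than the kernel matrix itself; that substitution is likewise an assumption.

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma=1.0):
    """Fill an n x n matrix with pairwise Gaussian similarities (assumed kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def spectral_bisect(K, tol=1e-8):
    """Split the points into two classes by the sign of one Laplacian eigenvector."""
    # Assumption: use the graph Laplacian L = D - K, with K as the adjacency matrix.
    L = np.diag(K.sum(axis=1)) - K
    vals, vecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    nonzero = np.flatnonzero(vals > tol)     # indices of eigenvalues treated as non-zero
    if nonzero.size == 0:
        raise ValueError("no eigenvalue above tol")
    v = vecs[:, nonzero[0]]                  # eigenvector of the smallest non-zero eigenvalue
    return np.where(v >= 0, 1, -1)           # class labels in {-1, +1}

# Hypothetical dataset: two groups of three points each.
X = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5],
              [4.0, 0.0], [4.5, 0.0], [4.0, 0.5]])
labels = spectral_bisect(gaussian_kernel_matrix(X, sigma=2.0))
```

The overall sign of an eigenvector is arbitrary, so which group receives +1 is not fixed; only the split itself is meaningful.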

2. (Currently amended) The method of claim 1, further comprising, after computing a 
the plurality of eigenvalues, determining a number of clusters of data points within the dataset by 
identifying all zero eigenvalues. 
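
As a hedged illustration of the zero-eigenvalue count in claim 2: for a graph Laplacian L = D - K, the multiplicity of the zero eigenvalue equals the number of connected components of the graph. The use of the Laplacian, the tolerance `tol`, and the function name `count_clusters` are assumptions of this sketch, not claim language.

```python
import numpy as np

def count_clusters(K, tol=1e-8):
    """Count eigenvalues of L = D - K that are zero up to a numerical tolerance."""
    L = np.diag(K.sum(axis=1)) - K
    vals = np.linalg.eigvalsh(L)             # real eigenvalues of the symmetric matrix L
    return int(np.sum(np.abs(vals) <= tol))

# Hypothetical kernel matrix with two disconnected blocks of two points each.
K = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])
n_clusters = count_clusters(K)
```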

3. (Currently amended) The method of claim 1, further comprising: 
computing a second eigenvector; and

minimizing a cut cost for bisecting the dataset by applying a threshold to the second 
eigenvector. 

4. (Original) The method of claim 3, wherein the threshold limits the second
eigenvector to entries of -1 and +1. 
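
The thresholding and cut-cost steps of claims 3 and 4 can be illustrated with a minimal sketch. It assumes the cut cost is the total kernel weight between points placed in different classes, and it searches only the midpoints between sorted eigenvector entries; the function names, the example matrix `K`, and the vector `v` are hypothetical.

```python
import numpy as np

def threshold(v, theta=0.0):
    """Limit the eigenvector entries to -1 and +1."""
    return np.where(v > theta, 1, -1)

def cut_cost(K, y):
    """Sum the kernel weights of pairs split across the two classes."""
    across = (y == 1)[:, None] & (y == -1)[None, :]
    return float(K[across].sum())

def best_threshold(K, v):
    """Try each midpoint between sorted entries and keep the lowest cut cost."""
    s = np.sort(v)
    candidates = (s[:-1] + s[1:]) / 2.0
    costs = [cut_cost(K, threshold(v, t)) for t in candidates]
    i = int(np.argmin(costs))
    return float(candidates[i]), costs[i]

# Hypothetical symmetric weights and eigenvector entries.
K = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.2],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.2, 1.0, 0.0]])
v = np.array([0.5, 0.4, -0.4, -0.5])
theta, cost = best_threshold(K, v)
y = threshold(v, theta)
```

For a symmetric `K` and labels in {-1, +1}, this cut cost equals yᵀLy / 4 with L = D - K, which is why the bisection can be posed as an eigenvector problem before thresholding.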

5. (Currently amended) The method of claim 1, wherein the data points within the 
dataset are unlabeled and the step of bisecting the dataset comprises assigning dividing up the 
data points to a cluster of among a plurality of clusters. 

6. (Original) The method of claim 1, wherein the data points within a first portion of the 
dataset are labeled and the data points of a second portion of the dataset are unlabeled, and 
wherein the step of filling the kernel matrix comprises: 

selecting a kernel K; 

normalizing the selected kernel K to -1 < K < +1; and

if both data points of a pair come from the first portion of the dataset, the corresponding 
kernel comprises a labels vector. 

7. (Currently amended) The method of claim 6, further comprising: 
calculating a second eigenvector of the kernel matrix to obtain an alignment;
thresholding the second eigenvector; and 

based on the alignment, assigning labels to the unlabeled data points. 

8. The method of claim 7, further comprising adjusting at least a portion of the plurality 
of kernels to align the second eigenvector with a pre-determined label. 

9. The method of claim 1, further comprising, prior to computing a plurality of 
eigenvalues, computing a first eigenvector and assigning a rank to each of the plurality of data 
points based on popularity. 

10. (Original) The method of claim 9, further comprising identifying as dirty any data 
points of the plurality having a low rank. 

11. (Original) The method of claim 10, further comprising cleaning the dirty data points.
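
One possible reading of the ranking in claims 9 to 11 is sketched below. It assumes that "popularity" is scored by the magnitude of each point's entry in the principal eigenvector of the kernel matrix, and that a fixed fraction of the lowest-scoring points is flagged as dirty. The scoring rule, the `fraction` cutoff, and the function names are assumptions of this sketch, not claim language.

```python
import numpy as np

def popularity_scores(K):
    """Score each point by its entry in the principal eigenvector of K."""
    vals, vecs = np.linalg.eigh(K)           # eigenvalues in ascending order
    return np.abs(vecs[:, -1])               # eigenvector of the largest eigenvalue

def flag_dirty(scores, fraction=0.25):
    """Return indices of the lowest-scoring points (hypothetical cutoff)."""
    n_dirty = max(1, int(len(scores) * fraction))
    return np.argsort(scores)[:n_dirty]

# Hypothetical kernel matrix: the fourth point is only weakly similar to the rest.
K = np.array([[1.00, 0.90, 0.80, 0.05],
              [0.90, 1.00, 0.85, 0.05],
              [0.80, 0.85, 1.00, 0.05],
              [0.05, 0.05, 0.05, 1.00]])
scores = popularity_scores(K)
dirty = flag_dirty(scores)
```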

12. (Currently amended) A spectral kernel machine comprising: 

at least one kernel selected from a plurality of kernels for mapping data into a feature 
space, the at least one kernel selected by training the plurality of kernels on a dataset comprising 
a plurality of data points wherein the dataset is divided into a plurality of clusters by applying 
spectral graph theory to the dataset and selecting the at least one kernel that is optimally aligned 
with the division between the plurality of clusters, wherein optimal alignment is achieved by
requiring that the probability of the plurality of clusters be the same.
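
As a rough illustration of selecting a kernel by alignment, the sketch below scores each candidate kernel matrix against a label vector y in {-1, +1} using the standard kernel alignment measure, the Frobenius inner product of K and yyᵀ divided by the product of their Frobenius norms. The candidate matrices and function names are hypothetical, and the equal-probability constraint recited in claim 12 is not modeled here.

```python
import numpy as np

def alignment(K, y):
    """Alignment between kernel matrix K and the label matrix y y^T."""
    Y = np.outer(y, y)
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    return float(np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y)))

def select_kernel(kernels, y):
    """Return the index of the best-aligned candidate and all scores."""
    scores = [alignment(K, y) for K in kernels]
    return int(np.argmax(scores)), scores

# Hypothetical division into two clusters and two candidate kernel matrices.
y = np.array([1, 1, -1, -1])
kernels = [np.eye(4), np.outer(y, y).astype(float)]
best, scores = select_kernel(kernels, y)
```

In this toy case the second candidate reproduces the division exactly and scores 1.0, while the identity matrix scores 0.5.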

13. (Original) The spectral kernel machine of claim 12, wherein the division between 
the plurality of clusters is determined by a first eigenvector in an adjacency matrix corresponding 
to a graph comprising a plurality of nodes comprising the plurality of data points. 

14. (Original) The spectral kernel machine of claim 12, wherein the dataset is unlabeled. 

15. (Original) The spectral kernel machine of claim 12, wherein the dataset is partially
labeled.

16. (Currently amended) A spectral kernel machine comprising: 

at least one kernel selected from a plurality of kernels for mapping data into a feature 
space, the at least one kernel selected by training applying the plurality of kernels on to a dataset
comprising a plurality of data points wherein the dataset is bisected into a plurality of clusters by 
applying spectral graph theory to the dataset and selecting the at least one kernel that minimizes 
a cut cost in the dichotomy partitioning of the data points between the plurality of clusters,
wherein the probability of the plurality of clusters is the same.

17. (Currently amended) The spectral kernel machine of claim 16, wherein the
dichotomy partitioning of the data points between the plurality of clusters is determined by a first 
eigenvector in an adjacency matrix corresponding to a graph comprising a plurality of nodes 
comprising the plurality of data points and wherein a second eigenvector determines maximal 
alignment by imposing a constraint that the probability of the plurality of clusters be the same.



